Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models

Authors

Abstract

We investigate the problem of speaker-independent acoustic-to-articulatory inversion (AAI) in noisy conditions within the deep neural network (DNN) framework. In contrast with recent results in the literature, we argue that a DNN vector-to-vector regression front-end for speech enhancement (DNN-SE) can play a key role in AAI when used to enhance spectral features prior to back-end processing. We experimented with single- and multi-task training strategies for the DNN-SE block, finding the latter to be beneficial for AAI. Furthermore, we show that coupling a DNN-SE block producing enhanced speech features with an AAI model trained on clean speech outperforms a multi-condition AAI model (AAI-MC) when tested on noisy speech. We observe a 15% relative improvement in Pearson's correlation coefficient (PCC) between our system and AAI-MC at 0 dB signal-to-noise ratio on the Haskins corpus. Our approach also compares favourably against using a conventional DSP speech enhancement front-end (MMSE with IMCRA). Finally, we demonstrate the utility of articulatory inversion in a downstream application. We report significant WER improvements on an automatic speech recognition task in mismatched conditions based on the Wall Street Journal corpus (WSJ), leveraging articulatory information estimated by AAI over spectral-alone features.
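The abstract evaluates inversion quality with Pearson's correlation coefficient between estimated and measured articulatory trajectories. A minimal sketch of that metric, computed per articulatory channel (the function name is my own, not from the paper):

```python
import numpy as np

def pearson_cc(pred, true):
    """Pearson's correlation coefficient per articulatory channel.

    pred, true: arrays of shape (frames, channels) holding estimated
    and ground-truth articulatory trajectories.
    Returns one correlation value per channel.
    """
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    pc = pred - pred.mean(axis=0)   # center each channel
    tc = true - true.mean(axis=0)
    num = (pc * tc).sum(axis=0)
    den = np.sqrt((pc ** 2).sum(axis=0) * (tc ** 2).sum(axis=0))
    return num / den
```

A value of 1 per channel indicates the predicted trajectory tracks the measured one perfectly up to scale and offset, which is why PCC (rather than raw error) is the standard AAI figure of merit.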


Similar Articles

Acoustic to articulatory inversion

The context of this work is speech analysis. The subject deals with acoustic-to-articulatory inversion, i.e. the recovery of the temporal evolution of the vocal tract shape from the signal. This topic is important because it is likely to give rise to applications in the domains of speech coding as well as second language learning. Acoustic-to-articulatory inversion relies on an analysis by synt...


Acoustic-to-articulatory inversion in speech based on statistical models

Two speech inversion methods are implemented and compared. In the first, multistream Hidden Markov Models (HMMs) of phonemes are jointly trained from synchronous streams of articulatory data acquired by EMA and speech spectral parameters; an acoustic recognition system uses the acoustic part of the HMMs to deliver a phoneme chain and the states durations; this information is then used by a traj...


A Deep Neural Network for Acoustic-Articulatory Speech Inversion

In this work, we implement a deep belief network for the acoustic-articulatory inversion mapping problem. We find that adding up to 3 hidden-layers improves inversion accuracy. We also show that this improvement is due to the higher expressive capability of a deep model and not a consequence of adding more adjustable parameters. Additionally, we show unsupervised pretraining of the system impro...


Acoustic-to-articulatory inversion mapping with Gaussian mixture model

This paper describes the acoustic-to-articulatory inversion mapping using a Gaussian Mixture Model (GMM). Correspondence of an acoustic parameter and an articulatory parameter is modeled by the GMM trained using the parallel acoustic-articulatory data. We measure the performance of the GMM-based mapping and investigate the effectiveness of using multiple acoustic frames as an input feature and us...
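The GMM-based mapping this snippet summarizes is commonly realized as the conditional expectation E[y | x] under a GMM fit on joint acoustic-articulatory vectors. A minimal sketch under that assumption (helper names are my own; the paper's exact configuration, e.g. dynamic-feature constraints, is not reproduced here):

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(X, Y, n_components=2, seed=0):
    """Fit a GMM on joint vectors z = [x; y] from parallel data."""
    Z = np.hstack([X, Y])
    return GaussianMixture(n_components=n_components,
                           covariance_type="full",
                           random_state=seed).fit(Z)

def invert(gmm, x, dx):
    """MMSE mapping: E[y | x] under the joint GMM (dx = acoustic dim)."""
    mu_x = gmm.means_[:, :dx]
    mu_y = gmm.means_[:, dx:]
    S_xx = gmm.covariances_[:, :dx, :dx]
    S_yx = gmm.covariances_[:, dx:, :dx]
    # component responsibilities p(k | x)
    lik = np.array([w * multivariate_normal.pdf(x, m, S)
                    for w, m, S in zip(gmm.weights_, mu_x, S_xx)])
    lik /= lik.sum()
    # weighted sum of per-component conditional means
    y = np.zeros(gmm.means_.shape[1] - dx)
    for k in range(len(lik)):
        y += lik[k] * (mu_y[k] + S_yx[k] @ np.linalg.solve(S_xx[k], x - mu_x[k]))
    return y
```

In practice, as the abstract notes, stacking several acoustic frames into x tends to improve the mapping by giving the model short-term context.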


Mixture Density Networks, Human Articulatory Data and Acoustic-to-articulatory Inversion of Continuous Speech

Researchers have been investigating methods for retrieving the articulation underlying an acoustic speech signal for more than three decades. A successful method would find many applications, for example: low bit-rate speech coding, helping individuals with speech and hearing disorders by providing visual feedback during speech training, and the possibility of improved automatic speech recognit...



Journal

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Year: 2022

ISSN: 2329-9304, 2329-9290

DOI: https://doi.org/10.1109/taslp.2021.3133218